Fast Frequent Itemset Mining using Compressed Data Representation
Authors
Abstract
Discovering association rules by identifying relationships among sets of items in a transaction database is an important problem in data mining. Finding frequent itemsets is computationally the most expensive step in association rule discovery and has therefore attracted significant research attention. In this paper, we describe a more efficient algorithm for mining the complete set of frequent itemsets from typical data sets. We represent the data as a compressed prefix tree, and our algorithm extracts the frequent itemsets directly from the tree. We present performance comparisons of our algorithm against the fastest Apriori algorithm, Eclat, and FP-Growth. The results show that our algorithm outperforms these algorithms on several widely used test data sets.
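To make the idea of a prefix tree as a compact transaction store concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the names `Node`, `build_prefix_tree`, `count_nodes`, and `support`, the toy transactions, and the choice to order items by descending frequency are all assumptions made for illustration. It only shows how shared prefixes reduce storage and how an itemset's support can be read off the tree without rescanning the transactions.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item plus the number of transactions through it."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_prefix_tree(transactions, min_support):
    """Insert each transaction as a path of its frequent items (illustrative only)."""
    freq = Counter(item for t in transactions for item in set(t))
    keep = {i for i, c in freq.items() if c >= min_support}
    root = Node()
    for t in transactions:
        node = root
        # Assumed ordering: most frequent item first, ties broken by name,
        # so that transactions share as long a prefix as possible.
        for item in sorted(set(t) & keep, key=lambda i: (-freq[i], i)):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


def support(node, itemset):
    """Count transactions containing every item in `itemset` by walking the tree.

    Items that were dropped as infrequent during construction yield 0.
    """
    total = 0
    for child in node.children.values():
        remaining = itemset - {child.item}
        if not remaining:
            total += child.count  # every transaction through here matches
        else:
            total += support(child, remaining)
    return total


# Hypothetical toy database, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]
root = build_prefix_tree(transactions, min_support=2)
print(count_nodes(root))                    # nodes stored in the tree
print(support(root, frozenset({"b", "c"})))  # transactions containing b and c
```

In this toy example the five transactions contain 12 occurrences of frequent items, but the tree needs only 6 nodes because common prefixes are stored once. A full miner would additionally enumerate candidate itemsets; that part is deliberately left out here.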
Similar resources
Efficient Mining Top-k Regular-Frequent Itemset Using Compressed Tidsets
Association rule discovery based on support-confidence framework is an important task in data mining. However, the occurrence frequency (support) of a pattern (itemset) may not be a sufficient criterion for discovering interesting patterns. Temporal regularity, which can be a trace of behavior, with frequency behavior can be revealed as an important key in several applications. A pattern can be...
Ramp: Fast Frequent Itemset Mining with Efficient Bit-Vector Projection Technique
Mining frequent itemsets using a bit-vector representation is very efficient for dense datasets, but highly inefficient for sparse datasets due to the lack of an efficient bit-vector projection technique. In this paper we present a novel, efficient bit-vector projection technique for sparse and dense datasets. To check the efficiency of our bit-vector projection technique, we present a ...
Ramp: High Performance Frequent Itemset Mining with Efficient Bit-Vector Projection Technique
Mining frequent itemsets using a bit-vector representation is very efficient for small dense datasets, but highly inefficient for sparse datasets due to the lack of an efficient bit-vector projection technique. In this paper we present a novel, efficient bit-vector projection technique for sparse and dense datasets. We also present a new frequent itemset mining algorithm Ramp (Real Algorithm...
Accelerating Closed Frequent Itemset Mining by Elimination of Null Transactions
The mining of frequent itemsets is often challenged by the length of the patterns mined and also by the number of transactions considered for the mining process. Another acute challenge that concerns the performance of any association rule mining algorithm is the presence of 'null' transactions. This work proposes a closed frequent itemset mining algorithm viz., Closed Frequent Itemset Mining a...
Smart frequent itemsets mining algorithm based on FP-tree and DIFFset data structures
Association rule data mining is an important technique for finding important relationships in large datasets. Several frequent itemsets mining techniques have been proposed using a prefix-tree structure, FP-tree, a compressed data structure for database representation. The DIFFset data structure has also been shown to significantly reduce the run time and memory utilization of some data mining ...